Implement table & tree disk usage statistics#17169

Open
shuwenwei wants to merge 67 commits into master from table_disk_usage_statistics_with_cache

shuwenwei (Member) commented Feb 5, 2026

Description

This PR implements disk-usage statistics collection at the table level for the table model and at the device level for the tree model. It adds the data structures, background tasks, and read APIs needed to compute and expose disk usage metrics for monitoring, admission control, and operational tooling.

Tree Model (No Cache)

IoTDB> show disk_usage from root.test.**
+---------+----------+--------+-------------+-----------+
| Database|DataNodeId|RegionId|TimePartition|SizeInBytes|
+---------+----------+--------+-------------+-----------+
|root.test|         1|       3|            0|         70|
+---------+----------+--------+-------------+-----------+
Total line number = 1
It costs 0.959s

Implements ShowDiskUsageNode and ShowDiskUsageOperator.
• Disk usage is calculated by scanning relevant TsFiles at query time.
• Supports:
  • Path pattern matching
  • Time partition filtering
  • Existing SQL semantics (SHOW DISK_USAGE)
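
As a rough illustration of on-demand calculation with time partition filtering, the sketch below sums file sizes per time partition and skips partitions the filter rejects. It is an assumption-laden simplification: `FileEntry` and `OnDemandDiskUsageSketch` are hypothetical names, and the PR scans real TsFiles rather than an in-memory list.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.LongPredicate;

// Hypothetical stand-in for one data file; the PR reads real TsFiles instead.
final class FileEntry {
  final long timePartition;
  final long sizeInBytes;

  FileEntry(long timePartition, long sizeInBytes) {
    this.timePartition = timePartition;
    this.sizeInBytes = sizeInBytes;
  }
}

final class OnDemandDiskUsageSketch {
  // Sums sizes per time partition at query time; nothing is cached between calls.
  static Map<Long, Long> sizeByPartition(List<FileEntry> files, LongPredicate partitionFilter) {
    Map<Long, Long> result = new TreeMap<>();
    for (FileEntry file : files) {
      if (partitionFilter.test(file.timePartition)) {
        result.merge(file.timePartition, file.sizeInBytes, Long::sum);
      }
    }
    return result;
  }
}
```

Because every query repeats the scan, the cost grows with the number of files touched, which is the trade-off the cached table-model path avoids.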

Table Model (With Cache)

IoTDB:information_schema> select * from table_disk_usage
+--------+----------+-----------+---------+--------------+-------------+
|database|table_name|datanode_id|region_id|time_partition|size_in_bytes|
+--------+----------+-----------+---------+--------------+-------------+
|   test1|        t1|          1|        5|             0|          142|
|   test1|        t2|          1|        5|             0|            0|
|   test1|        t1|          1|        6|             0|            0|
|   test1|        t2|          1|        6|             0|           82|
+--------+----------+-----------+---------+--------------+-------------+
Total line number = 4
It costs 2.821s

Table Model introduces a dedicated disk usage cache:
• TableDiskUsageCache manages all cache operations.
• A single-threaded background worker processes write, read, and maintenance tasks via an operation queue.
• Cache state is persistent across restarts.
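
The single-threaded worker design can be sketched as follows. This is a minimal illustration, not the PR's implementation: `DiskUsageCacheSketch` and its methods are hypothetical names, and persistence and compaction are left out. The point it shows is that funneling every write and read through one queue lets the worker own the state without locks.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: each operation is a Runnable on one queue,
// executed in order by a single worker thread.
final class DiskUsageCacheSketch implements AutoCloseable {
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  private final Map<String, Long> tableSizes = new HashMap<>(); // touched by the worker only
  private final Thread worker;
  private volatile boolean running = true;

  DiskUsageCacheSketch() {
    worker = new Thread(this::runLoop, "disk-usage-cache-worker");
    worker.setDaemon(true);
    worker.start();
  }

  private void runLoop() {
    try {
      while (running) {
        queue.take().run();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  // Write operation: record a size delta in bytes for a table.
  void addDelta(String table, long deltaBytes) {
    queue.add(() -> tableSizes.merge(table, deltaBytes, Long::sum));
  }

  // Read operation: runs on the worker, so it sees every write queued before it.
  CompletableFuture<Long> sizeOf(String table) {
    CompletableFuture<Long> result = new CompletableFuture<>();
    queue.add(() -> result.complete(tableSizes.getOrDefault(table, 0L)));
    return result;
  }

  @Override
  public void close() {
    queue.add(() -> running = false); // processed after all earlier operations
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

A caller would queue deltas with `addDelta` and wait on the future returned by `sizeOf`; a maintenance task such as compaction could be queued the same way.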

Cached Data
• TsFile-level table size statistics
• Object file size deltas, recorded incrementally
• Periodic snapshot + delta compaction

Query Integration
• Exposes statistics via information_schema.table_disk_usage.
• Supports:
  • Predicate pushdown (except on aggregated size columns)
  • Limit / offset
  • Parallel region-level scanning
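
One plausible reading of the pushdown rule is sketched below: predicates on key columns run before any size is computed, while a predicate on the size column can only run afterwards, followed by offset and limit. `TableDiskUsageScanSketch` and its parameters are hypothetical and do not reflect the PR's planner or operator API.

```java
import java.util.List;
import java.util.function.LongPredicate;
import java.util.function.Predicate;
import java.util.function.ToLongFunction;
import java.util.stream.Collectors;

// Hypothetical sketch; names and signatures are illustrative only.
final class TableDiskUsageScanSketch {
  static List<String> scan(
      List<String> tables,
      Predicate<String> keyFilter, // e.g. a predicate on table_name: safe to push down
      ToLongFunction<String> sizeOf, // the expensive step that yields size_in_bytes
      LongPredicate sizeFilter, // predicate on the aggregated column: applied afterwards
      long offset,
      long limit) {
    return tables.stream()
        .filter(keyFilter) // evaluated before any size lookup
        .filter(t -> sizeFilter.test(sizeOf.applyAsLong(t))) // residual filter
        .skip(offset)
        .limit(limit)
        .collect(Collectors.toList());
  }
}
```

In this sketch a table rejected by `keyFilter` never reaches `sizeOf`, which is the saving that pushdown is meant to deliver.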

Copilot AI (Contributor) left a comment

Pull request overview

This PR implements comprehensive disk usage statistics collection for both Tree Model (device-based) and Table Model databases in IoTDB. The implementation provides monitoring capabilities at the table/device level and time partition level.

Changes:

  • Adds SHOW DISK_USAGE SQL statement for Tree Model with on-demand calculation
  • Implements table_disk_usage information schema table for Table Model with persistent cache
  • Introduces background task infrastructure for cache maintenance with periodic compaction
  • Adds predicate push-down and limit/offset optimization support for information schema tables

Reviewed changes

Copilot reviewed 95 out of 95 changed files in this pull request and generated 33 comments.

| File | Description |
|---|---|
| pom.xml | Updates tsfile version to 2.2.1-260205-SNAPSHOT |
| TsFileID.java | Adds SHALLOW_SIZE constant (contains bug) |
| InformationSchema.java | Adds table_disk_usage schema and push-down support (contains bug) |
| TableDiskUsageCache*.java | Core cache implementation with writer/reader classes |
| ShowDiskUsageNode.java | Plan node for tree model disk usage queries |
| TableDiskUsageInformationSchemaTableScanNode.java | Plan node for table model information schema scans |
| ShowDiskUsageOperator.java | Execution operator for tree model |
| DiskUsageStatisticUtil.java | Base utility class for disk usage calculation |
| IoTDBDescriptor.java | Configuration support (contains bug) |
| Integration tests | Comprehensive tests for both tree and table models |

codecov bot commented Feb 6, 2026

Codecov Report

❌ Patch coverage is 53.59168% with 982 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.57%. Comparing base (e4ef4db) to head (a19bd11).
⚠️ Report is 23 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ional/InformationSchemaContentSupplierFactory.java | 0.00% | 185 Missing ⚠️ |
| ...ecution/operator/source/ShowDiskUsageOperator.java | 0.00% | 77 Missing ⚠️ |
| ...e/dataregion/utils/TreeDiskUsageStatisticUtil.java | 0.00% | 64 Missing ⚠️ |
| ...utils/tableDiskUsageCache/TableDiskUsageCache.java | 76.59% | 55 Missing ⚠️ |
| ...nner/distribute/TableDistributedPlanGenerator.java | 8.51% | 43 Missing ⚠️ |
| ...b/queryengine/plan/planner/LogicalPlanBuilder.java | 0.00% | 41 Missing ⚠️ |
| ...ngine/dataregion/utils/DiskUsageStatisticUtil.java | 63.96% | 40 Missing ⚠️ |
| ...db/db/queryengine/plan/analyze/AnalyzeVisitor.java | 0.00% | 38 Missing ⚠️ |
| ...kUsageCache/tsfile/TsFileTableSizeCacheReader.java | 81.12% | 37 Missing ⚠️ |
| ...an/planner/plan/node/source/ShowDiskUsageNode.java | 47.05% | 36 Missing ⚠️ |

... and 39 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17169      +/-   ##
============================================
+ Coverage     39.47%   39.57%   +0.09%     
  Complexity      282      282              
============================================
  Files          5098     5120      +22     
  Lines        341273   343782    +2509     
  Branches      43471    43795     +324     
============================================
+ Hits         134714   136041    +1327     
- Misses       206559   207741    +1182     

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 99 out of 99 changed files in this pull request and generated 9 comments.


keyFile1.createNewFile();
valueFile1.createNewFile();
tempKeyFile2.createNewFile();
Copilot AI Feb 6, 2026

Method testRecoverWriter2 ignores exceptional return value of File.createNewFile.

keyFile1.createNewFile();
valueFile1.createNewFile();
tempKeyFile2.createNewFile();
tempValueFile2.createNewFile();
Copilot AI Feb 6, 2026

Method testRecoverWriter2 ignores exceptional return value of File.createNewFile.

keyFile1.createNewFile();
valueFile1.createNewFile();
keyFile2.createNewFile();
Copilot AI Feb 6, 2026

Method testRecoverWriter3 ignores exceptional return value of File.createNewFile.

keyFile1.createNewFile();
valueFile1.createNewFile();
keyFile2.createNewFile();
valueFile2.createNewFile();
Copilot AI Feb 6, 2026

Method testRecoverWriter3 ignores exceptional return value of File.createNewFile.

keyFile1.createNewFile();
valueFile1.createNewFile();
keyFile2.createNewFile();
Copilot AI Feb 6, 2026

Method testRecoverWriter4 ignores exceptional return value of File.createNewFile.

keyFile1.createNewFile();
valueFile1.createNewFile();
keyFile2.createNewFile();
tempValueFile2.createNewFile();
Copilot AI Feb 6, 2026

Method testRecoverWriter4 ignores exceptional return value of File.createNewFile.

+ AbstractTableSizeCacheWriter.TEMP_CACHE_FILE_SUBFIX);

keyFile1.createNewFile();
tempKeyFile2.createNewFile();
Copilot AI Feb 6, 2026

Method testRecoverWriter5 ignores exceptional return value of File.createNewFile.

keyFile1.createNewFile();
tempKeyFile2.createNewFile();
tempValueFile2.createNewFile();
Copilot AI Feb 6, 2026

Method testRecoverWriter5 ignores exceptional return value of File.createNewFile.

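
One way to address the createNewFile comments above is to route test file creation through a small helper that checks the boolean result. The sketch below is a suggestion only: `TestFileSupport.createOrFail` is a hypothetical helper, not something in the PR. `File.createNewFile` returns false when the file already exists and throws `IOException` on I/O failure, so both cases are surfaced.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical test helper; the name is illustrative.
final class TestFileSupport {
  // Creates the file and fails loudly if it already existed or could not be created.
  static File createOrFail(File file) {
    try {
      if (!file.createNewFile()) {
        throw new IllegalStateException("File already exists: " + file);
      }
      return file;
    } catch (IOException e) {
      throw new UncheckedIOException("Could not create " + file, e);
    }
  }
}
```

Assuming the tests use JUnit 4, the same check can also be written inline as `Assert.assertTrue(keyFile1.createNewFile());`.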
context.getTimePartitionTableSizeQueryContextMap().entrySet()) {
Map<String, Long> tableSizeResultMap = entry.getValue().getTableSizeResultMap();
for (Map.Entry<String, Long> tableSizeEntry : tableSizeResultMap.entrySet()) {
int i = Integer.parseInt(tableSizeEntry.getKey().substring(5));
Copilot AI Feb 6, 2026

Potential uncaught 'java.lang.NumberFormatException'.

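
A guarded version of the flagged parse could look like the sketch below. It assumes the keys have the form `table<N>` (inferred from `substring(5)`, since "table" is five characters long); `TableNameIndex` is a hypothetical helper. Besides `NumberFormatException`, it also avoids the `StringIndexOutOfBoundsException` that `substring(5)` throws for keys shorter than five characters.

```java
import java.util.OptionalInt;

// Hypothetical helper; the "table" prefix is an assumption based on substring(5).
final class TableNameIndex {
  private static final String PREFIX = "table";

  // Returns the numeric suffix, or empty when the key does not match the expected form.
  static OptionalInt parse(String key) {
    if (key == null || !key.startsWith(PREFIX)) {
      return OptionalInt.empty();
    }
    try {
      return OptionalInt.of(Integer.parseInt(key.substring(PREFIX.length())));
    } catch (NumberFormatException e) {
      return OptionalInt.empty();
    }
  }
}
```

In a test, an empty result is probably best turned into an explicit assertion failure that names the offending key.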
sonarqubecloud bot commented Feb 6, 2026

Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)